Restoration of metabolic functional metrics from label-free, two-photon human tissue images using multiscale deep-learning-based denoising algorithms

Abstract. Significance: Label-free, two-photon excited fluorescence (TPEF) imaging captures morphological and functional metabolic tissue changes and enables enhanced understanding of numerous diseases. However, noise and other artifacts present in these images severely complicate the extraction of biologically useful information. Aim: We aim to employ deep neural architectures in the synthesis of a multiscale denoising algorithm optimized for restoring metrics of metabolic activity from low signal-to-noise ratio (SNR) TPEF images. Approach: TPEF images of reduced nicotinamide adenine dinucleotide (phosphate) (NAD(P)H) and flavoproteins (FAD) from freshly excised human cervical tissues are used to assess the impact of various denoising models, preprocessing methods, and data on metrics of image quality and the recovery of six metrics of metabolic function from the images relative to ground truth images. Results: Optimized recovery of the redox ratio and mitochondrial organization is achieved using a novel algorithm based on deep denoising in the wavelet transform domain. This algorithm also leads to significant improvements in peak-SNR (PSNR) and structural similarity index measure (SSIM) for all images. Interestingly, other models yield even higher PSNR and SSIM improvements, but they are not optimal for recovery of metabolic function metrics. Conclusions: Denoising algorithms can recover diagnostically useful information from low SNR label-free TPEF images and will be useful for the clinical translation of such imaging.

Supplementary Discussion S1: While loss functions such as mean absolute error (MAE or L1) and mean squared error (MSE or L2) are often used in other denoising studies, they face limitations on certain images due to the assumption of pixel-wise independence, which may not hold in all cases 34,36,63. To improve image similarity, an SSIM loss function was implemented to minimize dissimilarity between the denoised and GT 6X images. While this approach was promising, blurring and a loss of high frequency variance were observed in denoised images because of the Gaussian filter applied during SSIM calculation (Supplementary Fig. S1). Various filter sizes and sigma values were evaluated prior to selection of a 3 x 3 filter (Supplementary Table S1). Inclusion of pixel-wise loss functions in addition to SSIM loss was hypothesized to enable high visual similarity in denoised images while also preventing the loss of high frequency variance. Thus, in addition to MAE (L1) and MSE (L2), we tested three additional loss functions that penalized pixel-wise differences in combination with the SSIM loss function: SSIM + L2, SSIM + the coefficient of determination (R2), and SSIM + Frequency Focal Loss (FFL) (Supplementary Table S2) 51. FFL utilizes fast Fourier transforms (FFT) to calculate the frequency map of each denoised and GT 6X image and penalizes the difference in phase and magnitude between the two images 51. The impact of these five loss functions on denoising was assessed using the CARE model. MAE, MSE, and SSIM + FFL all led to the generation of visually similar images, yielding similar metrics of image quality (Supplementary Table S3). SSIM + R2 and SSIM + L2 loss functions led to weaker image quality metrics, although the differences were not significant across the entire test set (Supplementary Table S3). Differences in image quality were associated with the signal recovered from the nucleus and interstitial space, with smaller fluctuations in the cytoplasm of cells that varied between loss functions. The small fluctuations in cytoplasmic signal were of particular interest as the calculation of metabolic metrics was associated with NAD(P)H and FAD intensity measurements from the cytoplasm.
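As a rough illustration of how an SSIM term can be combined with a pixel-wise penalty, the sketch below computes a simplified SSIM + L2 loss on flattened images. It is not the training code used in this study: SSIM is evaluated from global image statistics rather than with the Gaussian-filtered local windows described above, and the function names, the weight `alpha`, and the assumption that intensities are scaled to [0, 1] are illustrative choices.

```python
def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM from global statistics of two flattened images.

    Assumption: a single window covering the whole image replaces the
    Gaussian-weighted local windows normally used for SSIM.
    """
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def mse(x, y):
    """Mean squared error (L2 penalty) between two flattened images."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def ssim_l2_loss(denoised, target, alpha=1.0, data_range=1.0):
    """(1 - SSIM) + alpha * MSE; alpha is a hypothetical weighting factor."""
    return (1.0 - global_ssim(denoised, target, data_range)) + alpha * mse(denoised, target)


# Toy flattened 2 x 2 "images" with intensities in [0, 1].
target = [0.10, 0.40, 0.35, 0.80]
denoised = [0.20, 0.30, 0.50, 0.60]
loss = ssim_l2_loss(denoised, target)
```

The structural term rewards agreement in mean, variance, and covariance, while the L2 term penalizes each pixel's deviation directly, which is the motivation given above for pairing the two.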
While standard metrics of image quality suggested all loss functions yielded similar images, differences in cytoplasmic signal accounted for varying performance on downstream metrics. Images restored by MSE, MAE, and SSIM + FFL all led to statistically significant improvements in β variability correlation between the denoised and GT 6X values (Supplementary Table S2). This was consistent with the improved PSNR of restored NAD(P)H images (Supplementary Table S3). MAE and MSE loss both generated FAD and NAD(P)H images with high PSNR values, leading to 1-5% improvements in recovery of some of the redox ratio (RR) metrics. SSIM + R2 was the only loss function to demonstrate statistically significant recovery of RR IQR variability. In comparison to other loss functions, SSIM + R2 loss led to improved FAD image PSNR; however, NAD(P)H image PSNR was poor (Supplementary Table S3).
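For reference, PSNR compares a restored image with its ground truth through the mean squared error. The minimal sketch below uses the standard definition; the function name and the assumption that images are flattened and scaled to a known `data_range` are illustrative and not taken from this study.

```python
import math


def psnr(reference, restored, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy example: a uniform error of 0.1 on a [0, 1] intensity scale.
reference = [0.0, 0.0, 0.0, 0.0]
restored = [0.1, 0.1, 0.1, 0.1]
value_db = psnr(reference, restored)
```

Because PSNR depends only on the average squared error, two restorations with similar PSNR can still differ in where the residual error sits, for example in the cytoplasm versus the nucleus.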
In principle, the loss functions with the higher PSNR and SSIM values (Supplementary Table S3) led to the recovery of metabolic function metrics that were highly correlated with GT 6X images. However, no loss function led to statistically significant improvement in mean β (β̄).
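The supplementary table captions state that significance was measured with the Fisher r to z transformation. The sketch below shows one common form of that test, a two-sided comparison of two Pearson correlations. It assumes the two correlations come from independent samples, which may differ from the variant used in this study; the correlation values are hypothetical, and the sample size of 51 simply mirrors the number of test set ROIs.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transform: z = atanh(r)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for a difference between two independent correlations.

    Returns (z statistic, p-value). Requires n1 > 3 and n2 > 3.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Hypothetical correlations with ground truth for denoised and raw images.
z_stat, p_value = compare_correlations(0.90, 51, 0.30, 51)
```

The transform makes the sampling distribution of a correlation approximately normal with variance 1/(n - 3), which is what allows the difference to be tested with a z statistic.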
Supplementary Figure S1: (a) A 290 x 290 μm² field of view from a benign cervical tissue biopsy. NAD(P)H and FAD images for the same region are shown along with the corresponding denoised image after using an RCAN model with SSIM loss. Scale bar = 50 μm. (b) A 44.2 x 44.2 μm² field of view (white square in a) of two cells. NAD(P)H and FAD images are shown to demonstrate the blurring effect observed when using SSIM loss for denoising. The white arrows demonstrate how high intensity regions are blurred in the denoised image, leading to changes in the spatial frequency distribution of the image. Scale bar = 10 μm.

Supplementary Table S1: Correlation values of WU-net models trained using SSIM + R2 loss with varying Gaussian filter sizes and sigma values. Fisher r to z transformation was used to measure significance. **p<0.01

Supplementary Table S2: Correlation values of CARE models trained using varying loss functions. Fisher r to z transformation was used to measure significance. *p<0.05 and **p<0.01

Supplementary Table S3: Summary of standard metrics of image quality for RAW 1X images and denoised images generated from various loss functions with CARE. Values are reported for mean performance (± standard deviation) across all test set ROIs.

Supplementary Discussion S2: 2D-DWT is a mathematical technique utilized to separate spatial frequency information from images. Mother wavelets are applied to extract an "approximate" image (LL) and detailed images (HL, LH, HH) from an image 56. The detailed images comprise horizontal (HL), vertical (LH), and diagonal (HH) features from the original image. Each of these four images (LL, HL, LH, and HH) has dimensions that are 50% of those of the original image, i.e., a 256 x 256 RAW 1X image will output four 128 x 128 images.
To implement WU-net without convolving the frequency bands, four successive models were independently trained to denoise each frequency band before restoring the denoised image (Supplementary Figure S4). In this implementation, the denoising of images was found to be faster as the input frequency images (LL, HL, LH, and HH) are a fraction of the size of the original images. Once denoised, images undergo inverse discrete wavelet transformation to restore the image to the original dimensions. For loss calculation, each frequency band output is compared to the corresponding frequency band from the ground truth image. This allows each frequency band model to optimize weights independent of the other frequency bands. The cumulative model is therefore believed to optimally denoise high and low frequency noise in the images.

Supplementary Table S4: Correlation values of WU-net models trained using NAD(P)H data and varying loss functions. Fisher r to z transformation was used to measure significance. *p<0.05, **p<0.01, ***p<0.001

WU-net based denoising demonstrates improved recovery of β metrics compared to identical CARE models. To understand this phenomenon, power spectral density (PSD) curves and curve fits were examined more closely. PSD vs. frequency plots (Figure S2a) were generated for all z-depths across all 51 test set ROIs. Curve-fitting was completed to extract the mitochondrial clustering power law fit. PSD values at the start and end frequencies of the power law fit were stored for downstream analysis. An array of PSD values was generated for both the WU-net (Wavelet) and CARE (Non-Wavelet) denoised images. PSD values were compared between both models to determine which frequencies were being denoised by WU-net in comparison to CARE. WU-net did not demonstrate any significant difference in the PSD values compared to CARE at the lower range of high spatial frequencies. However, WU-net significantly reduced the PSD value at the highest discrete spatial frequencies of the mitochondrial clustering power law fit in comparison to CARE models (p<0.05). The denoising of the highest spatial frequencies by WU-net demonstrated consistent performance, resulting in more reliable β metric recovery (Figure S2b). In comparison, CARE models demonstrated greater variability, which in turn compromised β metric recovery. Together, these results indicate that wavelet-based denoising provides greater denoising of high frequency noise, which improves the consistency of β metric restoration.
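To make the power law fit to the PSD curves concrete, the sketch below estimates an exponent by fitting a straight line in log-log space. It assumes that β denotes the exponent of a decay of the form PSD(f) ≈ A · f^(−β) and that ordinary least squares is applied over a preselected frequency range; the fitting procedure and frequency limits used in this study are not specified here, so the function name and the synthetic data are illustrative only.

```python
import math


def fit_power_law(freqs, psd):
    """Least-squares line fit of log10(PSD) against log10(frequency).

    Returns (beta, amplitude) for the model PSD = amplitude * f ** (-beta).
    All frequencies and PSD values must be positive.
    """
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in psd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, 10 ** intercept


# Synthetic PSD following an exact power law (hypothetical values).
freqs = [0.1, 0.2, 0.4, 0.8, 1.6]
psd = [5.0 * f ** -1.5 for f in freqs]
beta, amplitude = fit_power_law(freqs, psd)
```

Because the highest frequencies sit at one end of the fitted line, residual noise in that range can change the slope noticeably, which is consistent with the observation that cleaner high-frequency PSD values give more reliable β recovery.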
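The band decomposition and per-band denoising described in Supplementary Discussion S2 can be sketched with a single-level Haar transform. The mother wavelet used in this study is not stated here, so Haar is an assumption chosen for simplicity; the `band_models` callables stand in for the trained per-band networks and are purely hypothetical. Which detail band is called HL versus LH also varies between references.

```python
def haar_dwt2(img):
    """Single-level 2D Haar DWT of a list-of-rows image with even height and width.

    Returns (LL, HL, LH, HH), each half the size of the input along both axes.
    """
    h, w = len(img), len(img[0])
    ll, hl, lh, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 2  # local average
            hl[i // 2][j // 2] = (a - b + c - d) / 2  # left minus right
            lh[i // 2][j // 2] = (a + b - c - d) / 2  # top minus bottom
            hh[i // 2][j // 2] = (a - b - c + d) / 2  # diagonal difference
    return ll, hl, lh, hh


def haar_idwt2(ll, hl, lh, hh):
    """Inverse of haar_dwt2; restores the original image dimensions."""
    h2, w2 = len(ll), len(ll[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            s, x, y, z = ll[i][j], hl[i][j], lh[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


def denoise_per_band(img, band_models):
    """Apply one denoiser per band (LL, HL, LH, HH order), then invert the transform."""
    bands = haar_dwt2(img)
    denoised = [model(band) for model, band in zip(band_models, bands)]
    return haar_idwt2(*denoised)


# Toy 4 x 4 image; identity functions stand in for the four trained models.
image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
ll, hl, lh, hh = haar_dwt2(image)
restored = denoise_per_band(image, [lambda band: band] * 4)
```

With identity "models" the round trip returns the input unchanged, which illustrates why each band can be denoised and scored against its ground truth counterpart independently before the inverse transform recombines them.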